A Survey On Facilitating Document Annotation Using Content And Querying Value

Author

  • Sonal Nikam
Abstract

Organizations generate large volumes of data in textual form, and the structured information in such text is often hidden within the unstructured content. Existing algorithms extract information from raw data, but they are not cost-effective and sometimes produce impure result sets, especially when they operate on text without knowledge of how the text data is arranged. We propose two new techniques that facilitate the generation of structured metadata by identifying documents that are likely to contain information of interest to users; this information is then useful for querying the database to find the exact information or document. Users are likely to assign metadata to the documents they upload, which in turn helps other users retrieve those documents. Our approach relies on the idea that humans are more likely to add the necessary metadata while creating a document if prompted by the interface, or that it is much easier for humans (and/or algorithms) to identify the metadata when such information actually exists in the document, instead of naively prompting users to fill in forms with information that is not available in the document. The major modules of the system discover structured attributes and interesting knowledge or features about a document by jointly using two techniques: (a) the content of the text and (b) the query. Existing algorithms that extract knowledge from raw data consider words and their frequency counts but not phrases or typical sequences of words. As our contribution, we introduce phrase extraction, a technique that extracts typical sequences of words to construct knowledge from raw data.
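
The contrast between single-word frequency counts and phrase extraction can be illustrated with a minimal sketch. The abstract does not specify how phrases are extracted, so the code below assumes one simple possibility: counting contiguous word n-grams and keeping those that recur. The function name `extract_phrases`, its parameters, and the sample text are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_phrases(text, n=2, min_count=2):
    """Return contiguous n-word sequences that occur at least min_count times.

    Assumption: a 'typical sequence of words' is approximated by a repeated
    n-gram. Sentence boundaries are ignored, so an n-gram may span two sentences.
    """
    words = tokenize(text)
    # zip over n shifted copies of the token list yields every contiguous n-gram
    ngrams = zip(*(words[i:] for i in range(n)))
    counts = Counter(" ".join(gram) for gram in ngrams)
    return {phrase: c for phrase, c in counts.items() if c >= min_count}


# Hypothetical sample document
sample = ("The storm warning was issued at noon. "
          "A second storm warning followed in the evening.")

word_counts = Counter(tokenize(sample))           # word-level view
phrases = extract_phrases(sample, n=2, min_count=2)  # phrase-level view

print(word_counts.most_common(3))
print(phrases)
```

In this sample, the word-level counts give "storm" and "warning" the same frequency as the uninformative word "the", while the phrase-level view isolates "storm warning" as the only repeated two-word sequence, which is the kind of signal that per-word counting alone does not expose.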

Similar resources

Tags Re-ranking Using Multi-level Features in Automatic Image Annotation

Automatic image annotation is a process in which computer systems automatically assign the textual tags related with visual content to a query image. In most cases, inappropriate tags generated by the users as well as the images without any tags among the challenges available in this field have a negative effect on the query's result. In this paper, a new method is presented for automatic image...

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher through such massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

Use of Multimedia Input in Automated Image Annotation and Content-Based Retrieval

This research explores the interaction of linguistic and photographic information in an integrated text/image database. By utilizing linguistic descriptions of a picture (speech and text input) coordinated with pointing references to the picture, we extract information useful in two aspects: image interpretation and image retrieval. In the image interpretation phase, objects and regions mention...

Contribution à la modélisation et à l'interrogation de documents multimedia par les metadonnées

The aim of this paper is to present a proposal for multimedia documents annotation, based on modeling and unifying features elicited from content and structure mining. Our approach relies on the availability of annotated metadata representing segment content and structure as well as segment transcripts. Temporal and spatial operators are also taken into account when annotating documents. Any fe...

Modified Bayesian Algorithm in Data Mining

Document Annotation is the task of adding metadata information in the document which is useful in information extraction. Document annotation has emerged as a different stream in data mining. Majority of algorithms are concentrated on query workload. This paper uses Probing algorithm with Bayesian approach which identifies the attribute based on query workload, text frequency and content of the...

Publication date: 2014